244 research outputs found

    Thirty years of progeny from Chao’s inequality: Estimating and comparing richness with incidence data and incomplete sampling

    In the context of capture-recapture studies, Chao (1987) derived an inequality among capture frequency counts to obtain a lower bound for the size of a population based on individuals’ capture/non-capture records for multiple capture occasions. The inequality has been applied to obtain a non-parametric lower bound of species richness of an assemblage based on species incidence (detection/non-detection) data in multiple sampling units. The inequality implies that the number of undetected species can be inferred from the species incidence frequency counts of the uniques (species detected in only one sampling unit) and duplicates (species detected in exactly two sampling units). In their pioneering paper, Colwell and Coddington (1994) gave the name “Chao2” to the estimator for the resulting species richness. (The “Chao1” estimator refers to a similar type of estimator based on species abundance data.) Since then, the Chao2 estimator has been applied to many research fields and led to fruitful generalizations. Here, we first review Chao’s inequality under various models and discuss some related statistical inference questions: (1) Under what conditions is the Chao2 estimator an unbiased point estimator? (2) How many additional sampling units are needed to detect any arbitrary proportion (including 100%) of the Chao2 estimate of asymptotic species richness? (3) Can other incidence frequency counts be used to obtain similar lower bounds? We then show how the Chao2 estimator can also be used to guide a non-asymptotic analysis in which species richness estimators can be compared for equally-large or equally-complete samples via sample-size-based and coverage-based rarefaction and extrapolation. We also review the generalization of Chao’s inequality to estimate species richness under other sampling-without-replacement schemes (e.g., a set of quadrats, each surveyed only once), to obtain a lower bound of undetected species shared between two or multiple assemblages, and to allow inferences about undetected phylogenetic richness (the total length of undetected branches of a phylogenetic tree connecting all species), with associated rarefaction and extrapolation. A small empirical dataset for Australian birds is used for illustration, using online software SpadeR, iNEXT, and PhD.
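The abstract above says the number of undetected species can be inferred from the uniques and duplicates. A minimal Python sketch of the classic Chao2 lower bound follows, assuming the commonly cited form S_obs + ((T - 1)/T) * Q1^2 / (2 * Q2), with a fallback of ((T - 1)/T) * Q1 * (Q1 - 1) / 2 when there are no duplicates. The function name `chao2` and its input layout are illustrative assumptions, not the interface of SpadeR or iNEXT.

```python
from collections import Counter


def chao2(incidence_counts, num_units):
    """Chao2 lower bound of species richness from incidence data.

    incidence_counts: for each species, the number of sampling units in
        which it was detected (zeros are ignored).
    num_units: total number of sampling units T (hypothetical input name).
    """
    if num_units < 1:
        raise ValueError("num_units must be at least 1")
    counts = [c for c in incidence_counts if c > 0]
    s_obs = len(counts)                      # observed richness
    freq = Counter(counts)
    q1, q2 = freq[1], freq[2]                # uniques and duplicates
    k = (num_units - 1) / num_units          # finite-sample factor (T - 1)/T
    if q2 > 0:
        undetected = k * q1 * q1 / (2 * q2)
    else:
        # Fallback used when no duplicates were observed.
        undetected = k * q1 * (q1 - 1) / 2
    return s_obs + undetected


# Example: 8 observed species, 4 uniques, 2 duplicates, 10 sampling units.
estimate = chao2([1, 1, 1, 1, 2, 2, 3, 5], num_units=10)
```

Because the result is a lower bound, it should be read as "at least this many species", not as a point estimate of true richness.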

    The flower mites of Trinidad III: The genus Rhinoseius (Acari: Ascidae)

    Full text link
    http://deepblue.lib.umich.edu/bitstream/2027.42/56428/1/MP184.pd

    Ecological and biogeographic null hypotheses for comparing rarefaction curves

    The statistical framework of rarefaction curves and asymptotic estimators allows for an effective standardization of biodiversity measures. However, most statistical analyses still consist of point comparisons of diversity estimators for a particular sampling level. We introduce new randomization methods that incorporate sampling variability encompassing the entire length of the rarefaction curve and allow for statistical comparison of i ≥ 2 individual-based, sample-based, or coverage-based rarefaction curves. These methods distinguish between two distinct null hypotheses: the ecological null hypothesis (H0eco) and the biogeographical null hypothesis (H0biog). H0eco states that the i samples were drawn from a single assemblage, and any differences among them in species richness, composition, or relative abundance reflect only sampling effects. H0biog states that the i samples were drawn from assemblages that differ in their species composition but share similar species richness and species abundance distributions. To test H0eco, we created a composite rarefaction curve by summing the abundances of all species from the i samples. We then calculated a test statistic Zeco, the (cumulative) summed areas of difference between each of the i individual curves and the composite curve. For H0biog, the test statistic Zbiog was calculated by summing the area of difference between all possible pairs of the i individual curves. Bootstrap sampling from the composite curve (H0eco) or random sampling from different simulated assemblages using alternative abundance distributions (H0biog) was used to create the null distribution of Z, and to provide a frequentist test of Z | H0. Rejection of H0eco does not pinpoint whether the samples differ in species richness, species composition, and/or relative abundance. In benchmark comparisons, both tests performed satisfactorily against artificial data sets randomly drawn from a single assemblage (low Type I error). 
In benchmark comparisons with different species abundance distributions and richness, the tests had adequate power to detect differences among curves (low Type II error), although power diminished at small sample sizes and for small differences among underlying species rank abundances

    Estimating the Richness of a Population When the Maximum Number of Classes Is Fixed: A Nonparametric Solution to an Archaeological Problem

    Background: Estimating assemblage species or class richness from samples remains a challenging, but essential, goal. Though a variety of statistical tools for estimating species or class richness have been developed, they are all singly-bounded: assuming only a lower bound of species or classes. Nevertheless there are numerous situations, particularly in the cultural realm, where the maximum number of classes is fixed. For this reason, a new method is needed to estimate richness when both upper and lower bounds are known. Methodology/Principal Findings: Here, we introduce a new method for estimating class richness: doubly-bounded confidence intervals (both lower and upper bounds are known). We specifically illustrate our new method using the Chao1 estimator, rarefaction, and extrapolation, although any estimator of asymptotic richness can be used in our method. Using a case study of Clovis stone tools from the North American Lower Great Lakes region, we demonstrate that singly-bounded richness estimators can yield confidence intervals with upper bound estimates larger than the possible maximum number of classes, while our new method provides estimates that make empirical sense. Conclusions/Significance: Application of the new method for constructing doubly-bound richness estimates of Clovis stone tools permitted conclusions to be drawn that were not otherwise possible with singly-bounded richness estimates, namely, that Lower Great Lakes Clovis Paleoindians utilized a settlement pattern that was probably more logistical in nature than residential. However, our new method is not limited to archaeological applications. It can be applied to any set of data for which there is a fixed maximum number of classes, whether that be site occupancy models, commercial products (e.g. athletic shoes), or census information (e.g. nationality, religion, age, race)

    Modeling the ecology and evolution of biodiversity: Biogeographical cradles, museums, and graves

    Individual processes shaping geographical patterns of biodiversity are increasingly understood, but their complex interactions on broad spatial and temporal scales remain beyond the reach of analytical models and traditional experiments. To meet this challenge, we built a spatially explicit, mechanistic simulation model implementing adaptation, range shifts, fragmentation, speciation, dispersal, competition, and extinction, driven by modeled climates of the past 800,000 years in South America. Experimental topographic smoothing confirmed the impact of climate heterogeneity on diversification. The simulations identified regions and episodes of speciation (cradles), persistence (museums), and extinction (graves). Although the simulations had no target pattern and were not parameterized with empirical data, emerging richness maps closely resembled contemporary maps for major taxa, confirming powerful roles for evolution and diversification driven by topography and climate

    Rarefaction and extrapolation with Hill numbers: A framework for sampling and estimation in species diversity studies

    Quantifying and assessing changes in biological diversity are central aspects of many ecological studies, yet accurate methods of estimating biological diversity from sampling data have been elusive. Hill numbers, or the effective number of species, are increasingly used to characterize the taxonomic, phylogenetic, or functional diversity of an assemblage. However, empirical estimates of Hill numbers, including species richness, tend to be an increasing function of sampling effort and, thus, tend to increase with sample completeness. Integrated curves based on sampling theory that smoothly link rarefaction (interpolation) and prediction (extrapolation) standardize samples on the basis of sample size or sample completeness and facilitate the comparison of biodiversity data. Here we extended previous rarefaction and extrapolation models for species richness (Hill number qD, where q = 0) to measures of taxon diversity incorporating relative abundance (i.e., for any Hill number qD, q > 0) and present a unified approach for both individual-based (abundance) data and sample-based (incidence) data. Using this unified sampling framework, we derive both theoretical formulas and analytic estimators for seamless rarefaction and extrapolation based on Hill numbers. Detailed examples are provided for the first three Hill numbers: q = 0 (species richness), q = 1 (the exponential of Shannon's entropy index), and q = 2 (the inverse of Simpson's concentration index). We developed a bootstrap method for constructing confidence intervals around Hill numbers, facilitating the comparison of multiple assemblages of both rarefied and extrapolated samples. The proposed estimators are accurate for both rarefaction and short-range extrapolation. For long-range extrapolation, the performance of the estimators depends on both the value of q and on the extrapolation range. 
We tested our methods on simulated data generated from species abundance models and on data from large species inventories. We also illustrate the formulas and estimators using empirical data sets from biodiversity surveys of temperate forest spiders and tropical ants. © 2014 by the Ecological Society of America
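The three Hill numbers named in the abstract above (q = 0, 1, 2) can be illustrated with a short Python sketch. It computes only the empirical (plug-in) values from sample relative abundances, which the abstract notes tend to increase with sampling effort; it is not the rarefaction and extrapolation estimators the paper derives. The function name `hill_number` is an illustrative assumption.

```python
import math


def hill_number(abundances, q):
    """Empirical (plug-in) Hill number of order q from species abundances."""
    counts = [n for n in abundances if n > 0]
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one individual is required")
    p = [n / total for n in counts]          # sample relative abundances
    if abs(q - 1.0) < 1e-12:
        # q = 1 is defined as a limit: the exponential of Shannon entropy.
        return math.exp(-sum(pi * math.log(pi) for pi in p))
    return sum(pi ** q for pi in p) ** (1.0 / (1.0 - q))


sample = [50, 30, 15, 4, 1]
richness = hill_number(sample, 0)            # number of observed species
shannon_diversity = hill_number(sample, 1)   # exp(Shannon entropy)
simpson_diversity = hill_number(sample, 2)   # inverse Simpson concentration
```

For a perfectly even sample, all orders of q return the number of species, which is why Hill numbers are described as an effective number of species.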

    Unveiling the species-rank abundance distribution by generalizing the Good-Turing sample coverage theory

    Based on a sample of individuals, we focus on inferring the vector of species relative abundance of an entire assemblage and propose a novel estimator of the complete species-rank abundance distribution (RAD). Nearly all previous estimators of the RAD use the conventional plug-in estimator p̂_i (sample relative abundance) of the true relative abundance p_i of species i. Because most biodiversity samples are incomplete, the plug-in estimators are applied only to the subset of species that are detected in the sample. Using the concept of sample coverage and its generalization, we propose a new statistical framework to estimate the complete RAD by separately adjusting the sample relative abundances for the set of species detected in the sample and estimating the relative abundances for the set of species undetected in the sample but inferred to be present in the assemblage. We first show that p̂_i is a positively biased estimator of p_i for species detected in the sample, and that the degree of bias increases with increasing relative rarity of each species. We next derive a method to adjust the sample relative abundance to reduce the positive bias inherent in p̂_i. The adjustment method provides a nonparametric resolution to the longstanding challenge of characterizing the relationship between the true relative abundance in the entire assemblage and the observed relative abundance in a sample. Finally, we propose a method to estimate the true relative abundances of the undetected species based on a lower bound of the number of undetected species. We then combine the adjusted RAD for the detected species and the estimated RAD for the undetected species to obtain the complete RAD estimator. Simulation results show that the proposed RAD curve can unveil the true RAD and is more accurate than the empirical RAD. We also extend our method to incidence data. 
Our formulas and estimators are illustrated using empirical data sets from surveys of forest spiders (for abundance data) and soil ciliates (for incidence data). The proposed RAD estimator is also applicable to estimating various diversity measures and should be widely useful to analyses of biodiversity and community structure
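The abstract above builds on the Good-Turing concept of sample coverage. As a hedged illustration, the sketch below computes only the basic Turing estimate, 1 - f1/n, where f1 is the number of singletons and n is the sample size. It is not the generalized coverage theory or the RAD estimator the paper proposes, and the function name `turing_coverage` is an illustrative assumption.

```python
def turing_coverage(abundances):
    """Basic Good-Turing sample coverage estimate: 1 - f1 / n."""
    counts = [n for n in abundances if n > 0]
    n_total = sum(counts)                    # sample size n
    if n_total == 0:
        raise ValueError("at least one individual is required")
    f1 = sum(1 for n in counts if n == 1)    # singletons
    return 1.0 - f1 / n_total


# Example: 10 individuals, 2 singleton species.
coverage = turing_coverage([1, 1, 2, 6])
```

The estimate is the share of the assemblage's total relative abundance that belongs to detected species, so 1 minus the estimate approximates the share held by undetected species.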

    Models and estimators linking individual-based and sample-based rarefaction, extrapolation and comparison of assemblages

    Aims: In ecology and conservation biology, the number of species counted in a biodiversity study is a key metric but is usually a biased underestimate of total species richness because many rare species are not detected. Moreover, comparing species richness among sites or samples is a statistical challenge because the observed number of species is sensitive to the number of individuals counted or the area sampled. For individual-based data, we treat a single, empirical sample of species abundances from an investigator-defined species assemblage or community as a reference point for two estimation objectives under two sampling models: estimating the expected number of species (and its unconditional variance) in a random sample of (i) a smaller number of individuals (multinomial model) or a smaller area sampled (Poisson model) and (ii) a larger number of individuals or a larger area sampled. For sample-based incidence (presence-absence) data, under a Bernoulli product model, we treat a single set of species incidence frequencies as the reference point to estimate richness for smaller and larger numbers of sampling units. Methods: The first objective is a problem in interpolation that we address with classical rarefaction (multinomial model) and Coleman rarefaction (Poisson model) for individual-based data and with sample-based rarefaction (Bernoulli product model) for incidence frequencies. The second is a problem in extrapolation that we address with sampling-theoretic predictors for the number of species in a larger sample (multinomial model), a larger area (Poisson model) or a larger number of sampling units (Bernoulli product model), based on an estimate of asymptotic species richness. Although published methods exist for many of these objectives, we bring them together here with some new estimators under a unified statistical and notational framework. 
This novel integration of mathematically distinct approaches allowed us to link interpolated (rarefaction) curves and extrapolated curves to plot a unified species accumulation curve for empirical examples. We provide new, unconditional variance estimators for classical, individual-based rarefaction and for Coleman rarefaction, long missing from the toolkit of biodiversity measurement. We illustrate these methods with datasets for tropical beetles, tropical trees and tropical ants. Important Findings: Surprisingly, for all datasets we examined, the interpolation (rarefaction) curve and the extrapolation curve meet smoothly at the reference sample, yielding a single curve. Moreover, curves representing 95% confidence intervals for interpolated and extrapolated richness estimates also meet smoothly, allowing rigorous statistical comparison of samples not only for rarefaction but also for extrapolated richness values. The confidence intervals widen as the extrapolation moves further beyond the reference sample, but the method gives reasonable results for extrapolations up to about double or triple the original abundance or area of the reference sample. We found that the multinomial and Poisson models produced indistinguishable results, in units of estimated species, for all estimators and datasets. For sample-based abundance data, which allows the comparison of all three models, the Bernoulli product model generally yields lower richness estimates for rarefied data than either the multinomial or the Poisson models because of the ubiquity of non-random spatial distributions in nature. © 2012 The Author. Published by Oxford University Press on behalf of the Institute of Botany, Chinese Academy of Sciences and the Botanical Society of China. All rights reserved
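The classical individual-based rarefaction (multinomial setting) mentioned in the abstract above can be sketched in Python, assuming the standard hypergeometric formula for the expected number of species in a random subsample of m individuals drawn without replacement from a reference sample of n. The sketch covers interpolation only, not the extrapolation predictors or the unconditional variance estimators the paper introduces. The function name `rarefied_richness` is an illustrative assumption.

```python
from math import comb


def rarefied_richness(abundances, m):
    """Expected species count in a random subsample of m individuals.

    Uses E[S_m] = sum_i (1 - C(n - n_i, m) / C(n, m)), where n_i is the
    abundance of species i and n is the reference sample size.
    """
    counts = [n for n in abundances if n > 0]
    n_total = sum(counts)
    if not 0 <= m <= n_total:
        raise ValueError("m must lie between 0 and the reference sample size")
    denom = comb(n_total, m)
    # comb(a, m) is 0 when m > a, so abundant species contribute 1.
    return sum(1 - comb(n_total - n, m) / denom for n in counts)


reference = [2, 2]
curve = [rarefied_richness(reference, m) for m in range(0, 5)]
```

At m equal to the reference sample size the curve returns the observed richness, which is the point where an extrapolated curve would join it.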